Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix MastForest serialization #1482

Merged
merged 19 commits into from
Sep 10, 2024

Conversation

plafer
Copy link
Contributor

@plafer plafer commented Sep 6, 2024

Builds on #1466.

Stores the decorators in a separate buffer during serialization.

@plafer plafer added the no changelog This PR does not require an entry in the `CHANGELOG.md` file label Sep 6, 2024
Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Not a full review but I've left some comments inline.

Also, maybe I missed it, but I think the current code doesn't actually serialize/deserialize decorators for non-basic block nodes.

For this, we'd need to include before/after decorator offsets to MastNodeInfo like so:

pub struct MastNodeInfo {
    ty: MastNodeType,
    digest: RpoDigest,
    before_decorators_offset: u32,
    after_decorators_offset: u32,
}

But then these new fields would not be really used for basic block nodes (though, maybe we can still use them and get rid of decorator_list_offset field for MastNodeType::Block).

Another thing to consider is that the formats of decorator lists for basic blocks and other nodes are different - i.e., in one case it would be just a list of DecoratorIds and in the other it would be a list of (usize, DecoratorId) tuples. But maybe this doesn't matter and we can make the data context-dependent.

core/src/mast/mod.rs Outdated Show resolved Hide resolved
core/src/mast/serialization/mod.rs Show resolved Hide resolved
core/src/mast/serialization/basic_block_data_builder.rs Outdated Show resolved Hide resolved
core/src/mast/serialization/basic_block_data_decoder.rs Outdated Show resolved Hide resolved
@plafer
Copy link
Contributor Author

plafer commented Sep 9, 2024

Also, maybe I missed it, but I think the current code doesn't actually serialize/deserialize decorators for non-basic block nodes.

Correct, this is still left to be done in this PR.

For this, we'd need to include before/after decorator offsets to MastNodeInfo like so:

This would add 8 bytes to MastNodeInfo, even though the vast majority would be 0 (and all of them would be 0 in release builds after #1457). I was thinking instead of serializing 2 BTreeMaps at the MastForest level:

  • before_enter: BTreeMap<MastNodeId, Vec<DecoratorId>>
  • after_exit: BTreeMap<MastNodeId, Vec<DecoratorId>>

Upon deserialization, after deserializing all MastNodes, we can run through these maps and use MastForest::{set_before_enter, set_after_exit}. Then, the size of the binary increases proportionally to the number of decorators in the MAST forest.

@bobbinth
Copy link
Contributor

bobbinth commented Sep 9, 2024

This would add 8 bytes to MastNodeInfo, even though the vast majority would be 0 (and all of them would be 0 in release builds after #1457). I was thinking instead of serializing 2 BTreeMaps at the MastForest level:

  • before_enter: BTreeMap<MastNodeId, Vec<DecoratorId>>
  • after_exit: BTreeMap<MastNodeId, Vec<DecoratorId>>

Ah yes - that's a better solution!

@plafer plafer marked this pull request as ready for review September 9, 2024 22:06
@plafer plafer requested a review from bobbinth September 9, 2024 22:06
Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thank you! I left some small comments inline, once these are addressed, we can merge.

Also, I'm curious how this change impacted both deserialization time and execution time (I would think that the impacts should be negligible, but would be good to confirm).

core/src/mast/node/basic_block_node/mod.rs Outdated Show resolved Hide resolved
core/src/mast/serialization/info.rs Show resolved Hide resolved
core/src/mast/serialization/basic_blocks.rs Outdated Show resolved Hide resolved
core/src/mast/serialization/mod.rs Outdated Show resolved Hide resolved
core/src/mast/serialization/mod.rs Outdated Show resolved Hide resolved
core/src/mast/serialization/mod.rs Outdated Show resolved Hide resolved
core/src/mast/serialization/mod.rs Outdated Show resolved Hide resolved
core/src/mast/serialization/mod.rs Outdated Show resolved Hide resolved
core/src/mast/serialization/mod.rs Outdated Show resolved Hide resolved
@plafer
Copy link
Contributor Author

plafer commented Sep 10, 2024

For the blake3 benchmark (100 iterations), both next and this branch we just about 5450ms execution time

@plafer plafer merged commit 564ac46 into plafer-1122-fix-decorators Sep 10, 2024
9 checks passed
@plafer plafer deleted the plafer-fix-mastforest-serialization branch September 10, 2024 13:01
plafer added a commit that referenced this pull request Sep 11, 2024
* feat: move decorators to `MastForest`

* `DecoratorList` uses `DecoratorId`

* feat: add `before_enter` and `after_exit` decorators to MAST nodes

* WIP: `MastForestBuilder::make_basic_block()` returns `Either<Option<MastNodeId>, Vec<DecoratorId>>`

Still WIP on fixing `compile_body()` - specifically handling what to do when `make_basic_block()` returns a list of decorators

* fix build

* fix build after merge

* fix repeat statement

* implement `before_enter` and `after_exit`

* changelog

* fix no_std

* print decorators in BasicBlockNode

* join node pretty print

* basic block and repeat tests

* join decorators test

* implement pretty print for all nodes

* call node pretty print

* dyn node pretty print

* decorators external node

* join and split node pretty print

* fix `eq_hash`, and test `decorators_repeat_split`

* `MastForestBuilder`: add_decorator -> ensure_decorator

* fix asmop hash

* fix docs

* fmt

* add failing test

* eq_hash: now takes into account children's eq hashes

* remove redundant comment

* fix `MastForestBuilder::ensure_node` docs

* expand on `add_library` comment

* Remove `Either` type

* `add_block_with_raw_decorators` clarifying comment

* Move `&mut MastForestBuilder` into `BasicBlockBuilder`

* split `add_library()` docstring into 3 paragraphs

* Fix `MastForest` serialization (#1482)

* Move strings data to its own buffer

* move StringTableBuilder out of BasicBlockDataBuilder

* Create `StringTable` type

* Encode decorators separately

* fix strings table

* Introduce Operation iterator in BasicBlockNode

* Document use of `MAX_DECORATORS` value

* fix typo

* merge `basic_block_data_decoder` and `basic_block_data_builder` modules

* Use `DecoratorId::from_u32_safe` in deserialization

* serialization failing test

* serialize before_enter/after_exit maps

* Document serialization format

* rename `BasicBlockNode::operation_iter`

* `MastNodeInfo::new()`: soft enforce 0 values for non-basic block nodes

* update docstring for `BasicBlockDataDecoder`

* decorator id deserializing meaningful errors

* move `basic_block_data_decoder` var

* Don't use maps for before enter/after exit decorators

* block_builder var name

* Introduce new `EqHash`

* Cleanup `BasicBlockBuilder::make_block()`

* Incorporate assert and mpverify arguments into eq_hash

* update code organization docs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no changelog This PR does not require an entry in the `CHANGELOG.md` file
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants